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Abstract 

Study of major histocompatibility complex (MHC) loci has gained great popularity in recent years, partly due to their 
function in protecting vertebrates from infections. This is of particular interest in amphibians on account of major threats 
many species face from emergent diseases such as chytridiomycosis. In this study we compare levels of diversity in an 
expressed MHC class II locus with neutral genetic diversity at microsatellite loci in natterjack toad [Bufo (Epidalea) calamita) 
populations across the whole of the species' biogeographical range. Variation at both classes of loci was high in the glacial 
refugium areas (REF) and much lower in postglacial expansion areas (PGE), especially in range edge populations. Although 
there was clear evidence that the MHC locus was influenced by positive selection in the past, congruence with the neutral 
markers suggested that historical demographic events were the main force shaping MHC variation in the PGE area. Both 
neutral and adaptive genetic variation declined with distance from glacial refugia. Nevertheless, there were also some 
indications from differential isolation by distance and allele abundance patterns that weak effects of selection have been 
superimposed on the main drift effect in the PGE zone. 

Citation: Zeisset I, Beebee TJC (2014) Drift Rather than Selection Dominates MHC Class II Allelic Diversity Patterns at the Biogeographical Range Scale in 
Natterjack Toads Bufo calamita. PLoS ONE 9(6): el00176. doi:10.1371/journal.pone.0100176 

Editor: Sofia Consuegra, Swansea University, United Kingdom 

Received July 2, 2013; Accepted May 23, 2014; Published June 17, 2014 

Copyriglit: © 2014 Zeisset, Beebee. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits 
unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. 

Funding: This research was funded by the Leverhulme trust (http://www.leverhulme.ac.uk). Grant number: F/00230/AH. The funders had no role in study design, 
data collection and analysis, decision to publish, or preparation of the manuscript. 

Competing Interests: The authors have declared that no competing interests exist. 

* E-mail: i.zeisset@brighton.ac.uk 



Introduction 

The basis of adaptive rather than neutral genetic variation has 
become increasingly accessible in recent years as loci under 
selection are identified and characterised. Some of the most 
popular genes used in this context are those belonging to the major 
histocompatibility complex (MHC). These genes play an impor- 
tant role in the adaptive immune response of vertebrates. MHC 
class I molecules present intracellular pathogen peptides to CD8-I- 
T lymphocytes (T cells), primarily in response to viral infections, 
whereas MHC class II molecules (composed of ot and P subunits) 
present extracellular pathogen peptides to CD4-I- T cells after 
invasion by bacteria and fungi [1]. Although there is some 
variation among vertebrates in MHC gene structure, MHC class 
II (3 genes in the amphibian Xempus laevis are made up of six exons 
with an exon-intron organization similar to that of a typical 
mammalian class II P gene. Exon 2 encodes the P - 1 domain 
which includes most of the antigen binding sites (ABS) of the beta 
domain and is the most polymorphic region of the gene [2] , [3] . 
Due to high selective pressure on MHC genes, variation tends to 
be high, particularly at the ABS [4] . These sites encode amino acid 
residues involved in the recognition and binding of foreign 
peptides [5]. 

Frequency dependent selection, where bearers of common 
alleles are likely to be more susceptible to diseases and specific 
alleles can confer resistance [6] , [7] or heterozygote advantage [8] , 
[9] are believed to be common mechanisms involved in shaping 



MHC diversity. Low diversity at MHC loci has been implicated in 
elevating vulnerability to disease [10], [11], though several species 
have not shown demonstrable iU effects from limited MHC 
variation [12], [13]. Single MHC class I or II alleles confer 
resistance to specific diseases in many taxa [14], [15], [16] and 
some studies suggest that it is the prevalence of parasites which 
maintain high levels of MHC variation [17], [18], [19]. However, 
in many cases stochastic events, rather than selection influence 
MHC variation and variation at neutral markers is often 
correlated with that of MHC loci, e.g. [11], [20], [21]. The 
mechanisms driving MHC evolution have therefore not been fuUy 
resolved. Natural selection, demographic processes such as drift 
and gene flow as well as mutation rate are all likely to play a role. 

Few studies have investigated MHC variation and neutral 
genetic variation across entire biographical ranges of species and 
most of these involve only few populations or species within limited 
ranges [12], [22], [23], [24], [25], [26]. It is therefore particularly 
interesting to compare diversity at neutral markers and function- 
ally important genes in species at wide biogeographical scales. In 
particular, species with large ranges and which have been 
subjected to population botdenecks in areas of their distribution 
are ideal for comparisons of neutral and adaptive genetic 
variation. Postglacial expansion of amphibians from glacial refugia 
provides useful examples. Most European species survived the last 
glaciations, peaking around 20,000 years ago, in southern refugia 
from which they subsequently colonised northern Europe in the 
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postglacial Holocene period [27]. Furthermore, amphibians are 
experiencing high rates of global decline [28], [29], mainly due to 
habitat degradation and loss [30] but also because of emerging 
infectious diseases such as the chytrid fungus Batrachochyhium 
dendrobatidis (Bd, [31]). This pathogen precipitated the decline of at 
least two species in parts of Iberia [32], but many other infected 
species appear largely unaffected [33], [34], [35]. As MHC class II 
molecules play an important role in mounting acquired immune 
responses to fungi, and MHC variability is often associated with 
immunocompetence [36], it is likely that MHC- dependent 
resistance mechanisms contribute to fighting Bd infections [37]. 
A recent study on leopard frogs, for example, showed an 
association between MHC class II genotypes and survival of Bd 
infections [38] . Information on MHC loci, and MHC class II P 1 
genes in particular, is now available for several amphibians 
including members of the genus Bomhina, Alyte.s [39], Rana [40], 
[41], Bufo [42], [43], Espadarana and Sachatamia [44] and Triturus 
[12], [45], as well as for model organisms such as Xenopus, Silurana 
and Ambystoma [46], [47], [48], [49]. 

Here we report the results of a study of MHC and microsatellite 
diversity across the entire biogeographical range of Bufo calamita, 
an amphibian with glacial refugia in Iberia and south-west France, 
which now also inhabits much of north and central Europe [50]. 
We recendy characterized the entire exon 2 of an expressed MHC 
class II P locus (locus B, [43]) in B. calamita and here we provide 
evidence that this locus is or has been under selection in this 
species. We then tested the hypothesis that effects of selection on 
this locus during postglacial expansion resulted in patterns of 
diversity different from those of microsatellites, which were 
presumed to be primarily consequent on genetic drift. 

Materials and Methods 

Sampling Strategy 

For MHC analyses we extracted DNA from 325 individuals 
from 17 populations of B. calamita distributed over the entire 
species' range (see Figure 1 and Table 1). Thirteen of those 
populations were used in previous studies of microsatellite diversity 
[50], [51] but samples from five further populations used in those 
studies were no longer available (grey circles in Figure 1). We 
therefore supplemented the study with samples from four new sites 
(white circles in Figure 1) to maintain coverage of the fuU 
biogeographic range. In all cases free swimming tadpoles (Gosner 
stages 25-30) were sampled and instantly sacrificed by immersion 
in ethanol (a method approved by the British Home Office). AU 
UK samples were authorized and licensed by Natural England, 
the statutory government organisation responsible for wildlife 
conservation. Bufo calamita is a protected species in Britain and in 
some of the other countries providing samples. In all cases the 
appropriate permissions were obtained by the samplers in those 
countries. Bufo calamita is a vertebrate but no ethical permissions 
were required for this study because it only required instant 
sacrifice of larvae, which does not come under ethical coding since 
no manipulations, mutilations or other stresses were applied. DNA 
extractions were performed as described in Zeisset & Beebee [40] . 
Four of the 1 7 populations were located in the glacial refugium 
area (REF, as per [12]) of Iberia and southern France, while the 
other 13 were located in the postglacial expansion area (PGE, as 
per [12]), all as inferred from mtDNA haplotype diversity [50]. 

MHC Genotyping 

MHC class II locus B P chain exon 2 sequences of B. calamita 
were amplified using primers located in the flanking intron 
regions. This is the only functional MHC class II locus thus far 



identified in B. calamita [42], [43]. The forward primer 2F347 
(GTGACCCTCTGCTCTCCATT) witii reverse primer 2R307b 
(ATAATTCAGTATATACAGGGTCTCACC) amplified a se- 
quence of 279-282 base pairs (excluding primers). The 20 |ll 
PCRs contained approximately 25 ng DNA, 0.4 )lM of each 
primer, 100 |a,M dNTPs, Ix reaction buffer and 0.5 U of New 
England Biolabs Taq DNA polymerase. Thermal cychng consisted 
of an initial denaturation step of 94°C for 3 min and a touchdown 
protocol with a total of 35 cycles and ending with a elongation step 
of 72°C for 10 min. Each cycle started with a denaturation step of 
94°C for 30 sec and ended with an elongation step of 72°C for 
40 sec. Annealing temperatures consisted of 2 cycles at 62°C, 2 
cycles at 60°C, 2 cycles at 58°C and 29 cycles at 56°C, each for 
30 sec. Individual alleles were identified by SSCP analysis as 
described in Zeisset & Beebee [40] . Bands were excised from gels, 
re-amplified following Sunnucks et al. [52] and sent away for 
sequencing (Macrogen, Korea or Oxford Biochemistry Dept, UK). 
To reduce the risk of including PGR artefacts each allele was 
sequenced at least twice, either from different individuals or from 
two separate PCRs. 

Microsatellite Data 

For comparative purposes we used microsatellite data obtained 
in previous studies from eight polymorphic loci in 600 individuals 
sampled from 18 populations [50], [51] and distributed over the 
entire biogeographical range (Figure 1, Table 1). To minimize 
PGR and scoring errors a small subset of samples with high 
incidences of non-amphfication or difficult to score alleles were run 
twice. Microsatellite data were tested for the presence of nuU 
alleles and scoring errors using Micro-Checker 2.2.3 [53] and for 
effects of selection using the Fgx outlier approach implemented in 
LOSITAN [54], [55]. 

MHC Sequence Analysis 

To determine intron/exon boundaries we combined the 
putative intron and exon sequences (Genbank nos.: JX258913 
andJX258914) to obtain a 532 base pair sequence of -fi. calamita 
locus B allele 'Buca B2' and used NNSPLICE version 0.9 [56], as 
implemented on http://www.fruitfly.org/seq_tools/splice.html, to 
predict sphce sites. 

Sequences were aligned and edited manually using Bioedit v. 
7.0.9 [57]. The relative rates of non-synonymous (dN) and 
synonymous (dS) base pair substitutions were calculated according 
to Nei & Gojobori [58] applying the Jukes Cantor correction [59] 
for multiple hits in Mega 5 [60] . This was done for all sites and just 
for putative antigen binding sites (ABS), assuming functional 
congruence to human ABS identified by Tong et al. [61]. We used 
a Z-test [62] for positive selection. We also calculated average 
pairwise nucleotide distances (Kimura 2-parameter model, K2P) 
and Poisson-corrected amino acid distances [63] for ABS, non- 
ABS and all sites in Mega 5 with 1000 bootstrap rephcates to 
calculate standard errors for the distance measures. 

To identify specific sites under selection we used two new 
methods: a mixed effects model of evolution (MEME) to identify 
instances of both episodic and transient positive selection at 
individual sites [64] as well as a fast unbiased Bayesian 
approximation (FUBAR), both implemented on http:/ /www. 
datamonkey.org [65]. MEME is superior at detecting sites where 
episodic positive selection is likely to be occurring [64]. For these 
analyses we used 282 bp (94 amino acids) of 57 locus B alleles of 
three species, B.calamita, Bufo bufo and Bufo (P.seudoepidaka) viridis 
(Genbank nos.: HQ388288, HQ388291, JX258874-JX258913, 
JX046488-JX046501, JX258919). After testing for recombina- 
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Figure 1. Natterjack toad distribution and sampling sites for microsatellite and MHC analyses. Darl< shading indicates natterjacl< toad 
distribution in Europe. Blacl< circles indicate sites for which both MHC and microsatellite data was collected, white circles indicate MHC data only and 
grey circles microsatellite data only. Sampling sites (area or nearest town) were: 1. Algarve, Portugal; 2. Seville, Spain; 3. Dohana, Spain; 4. Almeria, 
Spain; 5. Leon, Spain; 6. Bordeaux, France; 7. Carmargue, France; 8 and 9 near Zurich, Switzerland; 10. Carnac, France; 11. Penmarch, France; 12. 
Cherbourg, France; 13. Kerry, Ireland; 14. Birkdale, UK; 15. Winterton, UK; 16. Texel, Netherlands; 17. Halle, Germany; 18. Bukowno, Poland; 19. 
Bielowieza, Poland; 20. Zealand, Denmark; 21 . Uddevalla, Sweden; 22. Parnu, Estonia. The dashed line indicates the approximate division between the 
glacial refugia (REF) and postglacial expansion (PGE) areas. Map modified from d-maps.com. 
doi:1 0.1 371/journai.pone.01 001 76.g001 



tion, a phylogenetic tree was inferred and used as the input for 
selection on particular codons using the two methods. 

To investigate the evolutionary relationship betw(;en th(; MHC 
loci in the three Bufo species we used three methods were to test for 
signatures of recombination. This analysis was carried out for 
282 bp of exon 2 sequence as well as for the 157 bp we used in 
phylogenetic: tree reconstruction to investigate the effects recom- 
bination may have on tree construction. For this analysis we also 
included B. calamita, B. bufo and B. viridis locus A alleles (locus A is a 
putative non-functional locus, identified in an earlier study [43], 
(Genbank nos,: JX258916, JX283352, JX283353, JX258920, 
JX046502-JX046504) and used GENECONV [66] and MaxChi2 
[67], both implemented in RDP,3.44 [68]. Both of these methods 
performed well in a comparison of 14 recombination detection 
methods [69] . We applied Bonferroni correction for multiple tests 
and used automask sequences to optimize our dataset and to 
reduce the severity of the multiple testing correction. In addition 
we used a genetic algorithm recombination detection method 
(GARD; [70]), as implemented on http://www.datamonkey.org/ 
GARD. 

MHC Phylogeny 

We constructed a phylogenetic tree to visualize the relationship 
between anuran MHC class II fi exon 2 alleles based on a total of 
51 unique exon 2 sequences from 14 species: Bufo bufo, B. viridis, 
Bombina bombina, B. variagata, B. pachypus, Alytes obstetricans, Xenopus 
laevis, Rana tempomria, R. catesbeiana, R. yavapaimsis, R. clamitans. R. 
sylvatica, Sachatamia ilex and Espadarana prosohlepon [43], [39], [41], 
[2], [46], [44] in the NCBI database, in addition to a selection of 
our own from this study chosen to include the some of the most 



diverse sequences. Sequences were trimmed to 157 bp to match 
available data from the published exon 2 sequences. The urodele 
Ambystoma tigrinum and Tritums cmiatus MHC sequences were 
included as outgroups. To investigate the evolutionary relationship 
among the 38 B. calamita locus B sequences from this study we 
constructed another phylogenetic tree, using 282 bp of sequence 
and X. laevis as an outgroup. For both trees evolutionary history 
was inferred using the Maximum Likelihood method based on the 
Kimura 2-parameter model [71] in Mega 5 [60]. Other tree 
building methods were also tested and provided congruent results 
(data not shown). A consensus tree was inferred from 1000 
replicates [72]. As recombination and possible gene duplication 
can affect phylogenetic trees we also constructed a phylogenetic 
network using the program SplitsTree4, which can account for 
conflicting signals from recombination and gene duplication [73], 
[74] for the B. calamita MHC class II locus B. We used Jukes- 
Cantor distances and the Neighbor-Net method. For a network 
depicting the relationship between aU three Bufo species' MHC 
sequences see [43]. 

Population Genetics 

Compliance with Hardy-Weinberg equilibrium (HWE) in each 
population was assessed for microsatellite and MHC loci by 
applying the exact tests in Genepop 4.0.10 [75]. Linkage 
disequilibrium of the microsatellite markers was also tested in 
Genepop. 

F-statistics [76], pairwise muMocus permutation tests of 
population differentiation, expected heterozygosity (He) and allelic 
richness (i.e. the mean number of alleles corrected for samples size; 
R), were estimated for each population and overall in FSTAT 
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2.9.3.2 [77]. We used Fgx primarily as a way to measure the level 
of difierentiation between populations. As Fgx niay be affected by 
highly variable markers such as microsateUites we also calculated 

Dest [7fS] in GenAlEx 6.5b3 [79], [80]. Pairwise comparisons of 
Fgi- and T)es,t within and between each class of loci, as well as 
isolation by distance (using In distance vs. Fgx/ 1 "Fgx), were made 
using the Mantel test facility in Genepop with 1000 permutations. 
Distances were measured using Google Earth between all 
sampling sites north of th(^ Pyrcmccs (the region of postglacial 
expansion) that were not s(;parat(;d by sea, as most amphibians 
cannot survive seawater exposure. Distances were ()thc'r\vise direct 
(Euclidean) allowing for bends to avoid sea water where necessary. 

To investigate how variation was partitioned within and among 
REF and PGE populations we carried out an AMOVA in 
Arlequin v. 3.5 [81] for microsatellite and MHC data. 

Phylogx'ographic relationships among tlic populations based on 
allele frequencies were determined separately for microsatellite 
and MHC data using Phylip v. 3.66 [82]. The analysis employed 
CavaUi-Sforza chord distances and the UPGMA algorithm with 
1000 bootstraps for the multUocus microsatellite data. 

Standard statistical tests for differences in allelic richness (R) and 
expected heterozygosity (He) between REF and PGE populations 
as well as correlations between microsatellite and MHC data and 
microsatellite and aUehc richness (R) and geographic distance were 
carried out using Statistix7 (Analytical Software, Tallahassee, 
USA). 

Results 

MHC Locus B Characteristics 

SpUce site analysis for locus B predicted intron/ exon boundaries 
between base pairs 3 and 4 after the 3' end of the forward primer 
binding site and between base pairs 281 and 282 (2 bp into the 
reverse primer binding site), making the putative exon 279 base 
pairs long. These sites corresponded to exon 2 boundaries found in 
some other amphibian species [39]. There was evidence of 
historiccil positive natural selection at ABS sites in B. calamita 
(P = 0.002, Z = 2.973). Using MEME (P<0.1) we identified 23 
positively selected codons in exon 2 and using FUBAR with a 
posterior probability >90% we found 19. Codons identified by 
both methods largely corresponded to putative antigen binding 
sites (ABS) as defined by Brown et al. [83] and Tong et al. [61] for 
the human MHC locus HLA-DRB (Figure 2). Average nucleotide 
distance over all nucleotide sequence parrs in exon 2 was 0.094 (SE 
0.012) in B. calamita. Average amino acid distance at this locus was 
0.149 (SE 0.028). Nucleotide and amino acid distances were much 
higher in the putative ABS (nucleotide: 0.376, SE 0.067; amino 
acid: 0.787, SE 0.159) than in non-ABS (nucleotide: 0.058, SE 
0.010; amino acid: 0.083, SE 0.021) sites. 

Of the 38 unique MHC class II exon 2 DNA sequences 
(Genbank nos.: h43882288, HC)388289, JX25887,o-JX258911) 
from locus B in 1 7 populations of B. lalamita, ten showed evidence 
of a recombination event including a codon insertion towards the 
3' end resulting in the addition of threonine. Five had a deletion of 
three nucleotides which corresponded to a codon deletion in the 
MHC class II DAB alleles of the great crested newt, Triturus 
cmtatus (Trcr-DAB*06, 08, 12, 15, 17. 19, 20 and 24, [84]) in 
Europe, as well as in the glass frog Espadarna prosohlepon (Espr- 
DRB*26, [44]) in Central America. In all three species phenyl- 
alanine or tyrosine was lost as well as a number of others in the 
case of the glass frog. 

Recombination tests GeneConv and MaxChi detected between 
two and three recombination events among all the whole exon 
2 MHC sequences, with between five and 15 recombination 



signals. GARD on the other hand detected no evidence of 
recombination. For the 157 bp sequences used in phylogenetic 
tree reconstruction only MaxChi detected one recombination 
event, with 13 recombination signals. 

Amphibian MHC class II exon 2 sequences formed some 
strongly supported clusters (Figure 3A). Rana, Xmopus and 
Discoglossoidea (Bombina and Alytes species) sequences all formed 
separate groups. Within those groups there was also strong support 
for some branches separating species. The Central American 
Srif/mtamia and Espndariina clustered strongly with the European 
Bufo. Both th(; phylogenetic network (Figure SI) and the tree 
(Figure 3B) produced congruent results for the B. calamita MHC 
class II B locus. 

Population Genetics 

For the MHC locus there were a total of 12 alleles in 256 
individuals in the PGE group, but 30 alleles in 69 individuals in the 
REF group. There was a remarkably high number of population- 
specific alleles (25 out of 38, 66%) in the MHC data, each of which 
was found in a single population (Table 1). Twenty six alleles 
(68°/()) including 20 population- specific alleles (80'X)) were only 
found in the REF group and eight alleles (five population- specific) 
only in the PGE group. Only four alleles (10%) occurred in both 
groups, though in the REF group they were only found in the SW 
France population and not in any of the Iberian populations. None 
of the alleles found in Iberia occurred north of the Pyrenees and 
vice versa. However, the alleles in the REF and PGE groups did 
not cluster as similar sequences (Figure 3B) implying a common 
ancient ancestry for the alleles occurring in both groups combined. 
Three of the 17 populations (Germany, Spain (Dofiana) and 
Portugal) were not in HWE at the MHC locus, in all cases due to a 
homozygote excess. As selection can generate divergence from 
HWE, we did not exclude these populations from further analysis. 
Although the presence of null alleles could not be ruled out, the 
fact that the primers were located in relatively conserved intron 
sequences means that null alleles are unlikely to be the cause of the 
deviation from HWE. Among the microsateUites LOSITAN 
identified one locus (Bcal|J,8) as possibly affected by positive 
selection and we therefore excluded this marker from subsequent 
analysis. The number of alleles in the remaining seven microsat- 
ellite loci ranged from 8 (Bcaln,2) to 25 (Bcaln,3). Diversity 
measures for microsatellite and MHC loci are given in Table 1 
and allele frequencies in Table A in File SI. Almost all of the 
microsatellite alleles were encountered at least twice, either in the 
same or in different populations and errors due to PCR or scoring 
are likely to be small. PCR repeats of individuals never gave 
conflicting results. Again, diversity was lower in the PGE group 
than in the REF group. There were 35 population- specific 
microsatellite alleles out of a total of 118 (30%) across all loci. A 
total of 52 alleles (44%, range 29-50%) including 26 population- 
specific alleles (74%) were found only in the REF group and 16 
alleles (nine private) only in the PGE group. Across all loci fifty 
alleles (42%, range 25-50%) occurred in both groups. No 
microsatellite locus therefore showed allelic difierentiation north 
and south of the Pyrenees as great as that shown by the MHC 
locus. 

Twenty-four of 126 population x microsatellite loci compari- 
sons showed significant deviations from HWE after Bonferroni 
correction. In all cases this was due to a homozygote excess and 
Micro-Checker indicated the possibility of nuU alleles in several 
populations at Bcal^tl, 2, 3, 4 and 6 with estimated frequencies 
ranging from 0.09 to 0.32 ([85], estimator 1; Table B in File SI). 
The number of loci with homozygote excess was particularly high 
in the German and Spanish (Seville) populations. Null alleles may 
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Figure 2. Amino acid alignment of a subset of MHC class 11 sequences. MIHC class II amino acid sequences of B. calamita (BC), B. bufo (BB) and 
6. viridis (BV). Position 2 is the first amino acid position in exon 2 in these species according to splice site analysis. Position 5 corresponds to the first 
amino acid position of the second exon in the human MHC locus HLA-DRB. Shaded columns represent putative antigen binding sites (ABS) according 
to Brown et al. (1993) and/or Tong ef al. (2006). Sign at position 77 denotes a codon deletion; signs 'x' and '*' indicate amino acid positions under 
positive selection as determined by a mixed effects model of evolution (MEME) and a fast unbiased Bayesian approximation (FUBAR), respectively. 
The positive selection analysis was based on 57 alleles from locus B. 
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therefore be a cause of the deviation from HWE. Another possible 
explanation is that in some populations a proportion of the 
samples consisted of siblings, although measures were taken to 
avoid sampling family groups [50]. However, as this was a small 



proportion of the total number of populations we did not exclude 
these from further analysis (see also [50]). 

After Bonferroni corrections there were no cases of linkage 
disequilibrium among the loci. 
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Figure 3. Phylogenetic tree of anuran exon 2 MHC class II nucleotide sequences. A: Multispecies comparisons using 157 bp of sequence. 
Triturus (Trcr) and Ambystoma (Amti) sequences were used as outgroups. Genbank accession numbers are given in brackets. B: 6. calamita alleles with 
Xenopus outgroup using 282 bp of sequence. Filled circles, alleles only found in the PGE; open circles, alleles only found in SW France and PGE. 
Remaining alleles were only found in the REF populations. A ML bootstrap consensus tree from 1000 replicates [82] was constructed in Mega 5 [60]. 
The evolutionary distances were computed using the Kimura 2-parameter method [71]. Only bootstrap values above 50% are shown. 
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Comparison of MHC and Microsatellite Variation 

For both microsatellite and MHC markers most \ ariation was 
aeeounted for at the witliin-population lev('l (49 "'o and 42% 
respectively) or among populations witliin REF and PGE groups 
(39% and 51% respectively, Table 2). However, the mean number 
of alleles corrected for sample size (allelic richness, R) was 
significantly higher in the REF than in the PGE for both 
microsatellite and MHC loci (MHC: X^ = 4.20, DF= 1, 
P = 0.0404; microsatellites: = 6.92, DF= 1, P = 0.0085). MHC 
and microsatellite allelic richness was significantly correlated 
among populations where both types of loci were- sampk-d 
(Figure 4A, r3 = 0.8371, n=15, P = 0.0001). Microsatellite and 
MHC allelic richness outside Iberia declined with distance from 
the SW France rcfugium area (Figure 4B, r^ = —0.7899, n= 14, 
P = 0.0013). Expected lu-terozygosity between the types of loci was 
also significantly correlated (Figure 4C, rs = 0.8312, n=15, 
P = 0.0002). None of the correlations were unduly influenced by 
populations with a significant amount of null alleles (see Figure 4A 
and Figure 4C as well as Table A and Table B in File SI) and the 
correlations between heterozygosity and allelic richness for both 
types of loci were also significant excluding SW Spain and 
Germany from the analysis (Hg: r, = 0.7552, n= 13, P = 0.0001; 
R: r, = 0.7872, n = 13, P = 0.000). Average Fis values of MHC and 
microsatellite loci across aU populations in this PGE zone were low 
in both cases (0.019 and 0.028 respectively) and not significanfly 
different (Wilcoxon signed rank test, P = 0.603). 

Pairwise Fgi- comparisons among populations indicated signif- 
icant population differentiation between 128 of the 136 population 
comparisons for MHC loci and between all but three for the 
microsatellite data (see Table SI). Using Mantel tests, pairwise FgT 
and Dest were significanfly intercorrelated both for MHC and 
microsatellite loci for the PGE region including SW France 
(MHC: r, = 0.6372, n= 105, P<0.0001; microsatellite: r, = 0.2548, 
n= 105, P = 0.0089). However, in several cases where geograph- 
ical separation was high, MHC Dggi = 1 thus providing incom- 
plete resolution of differentiation level. Subsequent comparisons 
therefore focused on pairwise F^t estimates which were correlated 
between MHC and microsatellite loci (r, = 0.3128, n=105, 
P = 0.0012). Excluding Iberian populations, among areas analysed 
for both microsatellites and MHC genotypes and not separated by 
seawater (n = 8; Bordeaux, SW France; Zurich, Switzerland; 
Camac, France; Penmarch, France; Cherbourg, France; Halle, 
Germany; Bukowno/Bielowieza, Poland; Parnu, Estonia), the 
correlation between MHC and microsatellite pairwise Fgx 
estimates was also strong (r, = 0.432, P = 0.025). Mean pairwise 
Est estimates in this region were similar for both types of loci 
(0.428 for microsatellites, 0.487 for MHC) and there was 
significant isolation by distance (IBD, P<0.0001 in both cases). 
However, the strength of IBD was greater for MHC than for 
microsatellites as shown using untransformed F^t and distance 
estimates in Figure 5. Regression slopes for the two loci were 
significanfly different (F = 6.14, P = 0.0165). This strongly indicates 
a role of selection in shaping MHC diversity, as the effects of drift 
on microsatellite Fgx estimates are expected to be greater than 
those on MHC loci, due to their higher mutation rates. 

Phylogeographic trees based on microsatellite and MHC allele 
frecjuencic's were broadly congruent (Figure 6). However, allele 
frequencies and distributions in the PGE region were significanfly 
different between the loci (Figure 7). We excluded colour coding 
for the MHC locus in the Iberian populations from this 
comparison, as they do not share any alleles with the other 
populations and contain a large number of population specific 
alleles. For a fuU comparison see Table A in File SI for allele 
frequencies at all loci in all populations. Certain MHC alleles were 
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C: Correlation of MHC and microsatellite expected heterozygosity (He). 
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common in adjacent geographic areas (e.g Buca B2 in Ireland, 
UK, Netherlands, Germany and Sweden, Buca B5 in Sweden, 
Denmark, Estonia, Poland, Switzerland and Germany) (Figure 7A). 
No such pattern could be discerned for the most polymorphic 
microsatellite locus Bcal|a,3 (Figure 7B). 
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Table 2. AMOVA results for MHC and microsatellite loci. 







Among groups 


Among populations within groups 


Within populations 


microsatellites 


1 1 .86* 


39.44* 


48.69* 


MHC 


6.37* 


42.38* 


51.25 ns 



Percentage of variation explained by among group {PGE and REF), among population within groups and within population. Associated significance values of variance 
components based on 1023 permutations: ns = non significant, * = P<0.001. 
doi:l 0.1 371 /journal.pone.Ol 001 76.t002 



Discussion 

Nucleotide distances in the MHC locus (0.094) were compara- 
ble to those found in other amphibian species, where diversity 
ranged from 0.062 in Ram warszewitschii to 0.155 in R. catesheiana 
[41], [39], [40]. Many of the positively selected sites we identified 
in exon 2 corresponded to those involved in peptide binding in 
equivalent human MHC class II proteins [83], [61]. Many of these 
sites were congruent with human ABS identified by either Tong 
et al. [61] or Brown et al. [83] (Figure 2). The others were located 
outside the human ABS and some of the human ABSs were not 
identified as positively selected sites in the Bufo MHC class II loci. 
Similar findings have been reported by others (e.g. [86], [87], [84], 
[41]), indicating that species- specific selection pressures were 
acting on the MHC genes. Two methods for detecting recombi- 
nation in die Bufo MHC sequences indicated its occurrence in 
these species. This may explain the adjacent intron 2 sequence 
similarity found at locus A and B within B. calamita [43] . Although 
our phylogenetic analysis was based on only a short DNA 
sequence, it does reflect the current view of the phylogenetic 
relationship of amphibians [88], [89], [90]. Interestingly the 
Central American glass frog Espadamna prosoblepon, which clustered 
with Bufo in our phylogenetic tree, also showed remarkable 
sequence similarity to Bufo MHC sequences in the first 200 bp of 
intron 2, which basically consists of a 100 bp repeat [43], [44]. 
Although we did not sequence intron 2 of B. viridis or B. bufo the 
fact that the intron- specific primers amplify MHC loci in these 
species indicates at least some conservation at introns across taxa 
[43]. Blast searches did not find these sequences elsewhere. Such 
conservation across widely separate taxa may indicate some 
functional significance of this sequence. For example, the first 




0.0 -I 1 1 1 1 , 1 1 

0 500 1000 1500 2000 2500 3000 

Distance (Km) 

Figure 5. Relationships of pairwise Fst and distance estimates. 

Microsatellite data are represented by open circles and MHC data by 
filled circles. 
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130 bp of intron 2 sequences of New World ranid species {Rana 
catesheiana, R. clamitans, R. pipiens, R. sylvatica, R. yavapaiensis, R. 
warszweitschii, R. palustris; [41]) were also remarkably conserved 
across species, though no repeat was detected. Sato et al. [91] 
analysed 1 1 4 intron 2 sequences of passerine bird species and 
found that most of them largely consisted of repeat sequences, with 
a 10 bp repeat being particularly common. In addition they found 
a 60-80 bp DNA segment in intron 2 that occurred in noncoding 
segments of MHC sequences in a number of other passerine bird 
species. The function or role of repetitive sequences or conserved 
elements in intron 2 is not yet clear but clearly requires further 
study. 

Of particular interest within the B. calamita MHC were alleles 
(all in Iberian REF populations) that had a codon deletion at the 
same position as found in the great crested newt Triturus cristatus in 
Romanian REF populations [12] as well as in the glass frog 
Espadarna prosoblepon (Espr-DRB*26, [44]) in Central America. It 
may be that selection pressure in colder climates eliminated these 
alleles from populations in North European amphibians, or that 
they confer advantages in warmer climates. The loss of these 
alleles by drift as populations expanded north cannot be ruled out 
but seems a remarkable coincidence for two unrelated taxa. 

There was a clear difference in the levels of both MHC and 
microsatellite diversity between the REF and the PGE populations 
of B. calamita. The lack of shared MHC alleles between the REF 
and PGE populations was surprising and it is possible more shared 
alleles may be found in other REF populations. Nevertheless, for 
comparison, Babik et al. [12] found that populations of the great 
crested newt {Triturus cristatus) in post glacial expansion (PGE) areas 
possessed a subset of alleles from the refugia (REF) populations, 
when they compared three PGE populations from across Europe 
to only four small REF populations from Romania. Our data 
support the theory that natterjack toads survived the Weichselian 
glacial maximum 20 000 years BP in at least one north European 
refuge, most likely in France, and colonized northern Europe from 
there [50] . In B. calamita and T. cristatus variation of microsatellite 
and MHC loci was high in the REF groups and much lower in the 
PGE groups. A decrease in allelic diversity from southern to 
northern Europe is well documented in neutral loci (e.g. [92]). The 
high similarity in diversity distribution (decreasing in the PGE area 
as a function of distance from the REF area) and in phylogeo- 
graphic patterns between the two types of loci imply that drift 
rather than selection was the dominant influence on MHC allelic 
variation at the biogeographic range scale. Mean Fjg estimates 
were close to zero for both classes of loci, with no signal of 
heterozygote excess as an indicator of diversifying selection in the 
MHC locus. A recent meta-analysis of the roles of natural selection 
and genetic drift in shaping MHC variation concluded that 
selection combined with drift during population bottlenecks can 
result in loss of MHC polymorphism at even greater rates than 
neutral genetic diversity [93]. Other studies have shown that 
microsatellite and MHC diversity is lost in similar proportions over 
time, with balancing selection unable to mitigate genetic drift [94] 
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Figure 6. Phylogeography of B. calamita populations. Bootstrap values >50% are given for the microsatellite analysis. Sampling site numbers 
corresponding to Figure 1 are given in brackets. 
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and that MHC diversity declines more steeply than microsatellite 
diversity after a bottieneck [95]. However, inconsistencies remain 
and in some cases selection can maintain polymorphism at MHC 
loci during population bottlenecks (e.g. [96]). 

When comparing two difiFerent marker systems such as MHC 
loci and microsatellites it is important to consider the potential 
differences in the ages of observed alleles due to the higher 
mutation rates of microsatellites and potential back mutations. 
Microsatellite mutation rates in B. calamita have been estimated at 
a relatively low 1x10 ' [50], whereas the mutation rate at the 
DRBl locus in chimpanzees is estimated to be 1.31 xlO~^ per site 
per year [97]. Assuming a similar mutation rate for the MHC 
locus B in B. calamita this would give a mutation rate of 1.1 xlO~ 
for 279 bp of exon 2 per generation (three years), not much 



different from that of the microsatellites. Microsatellite mutation 
rates increase with microsatellite length and contractions become 
more likely than expansions as length increases [98]. As natterjack 
microsatellite length was generally higher in the PGE than in the 
REF area [50] it is possible that some variation was masked by 
back mutations generating homoplasy. However, the natterjack 
microsatellites were relatively short (mostly around 10-20 repeats 
with a maximum of 29 for Bcal|J.3) and significant homoplasy was 
considered unlikely. 

Despite the likely dominance of drift effects on B. calamita MHC 
variation across the species' range there were some interesting 
differences between MHC and microsateUite genotype distribu- 
tions that may imply an additional, albeit minor role of selection in 
structuring the MHC locus at this large geographical scale. MHC 





Figure 7. Allele frequency distributions in the PGE area. A: MHC locus allele frequency distributions. Colour coding for the Iberian populations 
is excluded from this figure due to the large number of population specific alleles present; B: Bcal(i3 locus allele frequency distributions. IVIaps 
modified from d-maps.com. 
doi:1 0.1 371/journal.pone.01 001 76.g007 
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alleles were more sharply differentiated between the REF and 
PGE regions than were the microsateUite alleles. Isolation by 
distance effects in those parts of the mainland Europe PGE area 
where gene flow remained possible after postglacial sea level rise 
was significantly stronger for the MHC than for the microsateUite 
loci. The commonest MHC alleles in the PGE group were-, on 
average, at higher frequency and more geographically clustered 
than the commonest alleles in the microsateUite locus with the 
most comparable aUelic diversity. This pattern difference might 
imply weak effects of selection reflected in patterns of common 
MHC allele abundance in specific rc'gions, such as north-west 
France, north-central Europe and eastern Europe, perhaps in turn 
reflecting local differences in pathogen communities. Pathogens 
often exhibit a 'latitudinal diversity gradient', with high diversity at 
the equator decreasing towards the poles [99]. For example, 
temperature is an indirect selective mechanism maintaining MHC 
diversity in wUd salmon [18]. Therefore it is possible that the 
higher MHC diversity in the southern populations of B. calamita is 
maintained by higher pathogen- mediated selection pressure, but 
further studies into actual difference of pathogen prevalence in the 
various regions are needed to test this hypothesis. 

Overall, our evidence clearly implies a stronger influence of drift 
than selection on this B. calamita MHC locus at the biogeograph- 
ical scale. This is essentially simUar to the situation discovered with 
a comparable study of another widespread European amphibian, 
Tritums cristatus [12]. Comparison of MHC and neutral loci in four 
populations of Adantic herring [Clupea harmgus) also failed to detect 
any evidence of selection acting on the MHC locus, although in 
this study a MHC-embedded microsateUite locus was used and it is 
not clear to what extent this reflected varialjilit)' in the exon [100]. 
Marsden et al. [24] also found that although microsateUite diversity 
and MHC diversity were correlated in African wild dog 
populations, indicating genetic drift to be a major influencing 
factor, there were signatures of selection at the MHC locus. The 
apparent weakness of selective effects may however be influenced 
by the scale of the study. In Atlantic salmon drift and migration 
were more important than selection on large geographical scales 
but at smaller geographical scales the influence of selection was 
detected at MHC loci [101]. Postglacial colonisation with 
associated bottlenecks can evidendy leave strong signatures of 
genetic drift long after the event. In contrast to this, the same 
MHC locus in a B. bufo population translocated over 400 km 
within the UK adapted within three generations to an aUele 
frequency distribution similar to that of neighbouring populations 
in the receptor area [102], presumably reflecting selection in 
favour of the new conditions. A recent phylogeographic study of 
the bank vole [Adyodes glareolus) found no spatial genetic structure 
among populations at MHC loci, but clear differentiation 
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Figure SI Phylogenetic network of MHC class II beta 
exon 2 sequences. Neighbor-Net tree based on Jukes-Cantor 
distances of 282 bp of sequence of MHC locus B from B. calamita. 
(TIF) 

Table SI Population differentiation as estimated by FgT 
values. Numbers in brackets refer to population numbers in 
Figure 1. Non-significant values are indicated in red italics. A: 
Population differentiation as estimated by Fgx values for MHC 
data. P-values obtained after: 2720 permutations. IncUcative 
adjusted nominal level (5%) for multiple comparisons is: 
0.000368. B: Population differentiation as estimated by Fst values 
for microsateUite data, based on 7 loci. P-values obtained after: 
3060 permutations. Indicative adjusted nominal level (5%) for 
multiple comparisons is: 0.000327. 
PCLSX) 

File SI Table A (sheet 1): AUelic frequencies in each population 
for microsateUite and MHC loci. Numbers in brackets refer to 
population numbers in Figure 1 . Table B (sheet two): Brookfield 1 
estimates of nuU aUele frequencies at microsateUite loci. 
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